
Weighted lexical semantic maps for areal lexical typology. Verbs of perception and cognition as a case study


Abstract

This paper aims to contribute to Distributional Typology, which investigates linguistic diversity directly (“what’s where why?”, Bickel, 2007), by examining the typology of (co-)lexicalization patterns using a bottom-up approach to semantic maps. Specifically, we propose a new method for constructing semantic maps on the basis of massive cross-linguistic data, in order to evaluate the effects of (i) inheritance, (ii) language contact, and (iii) other environmental and cultural factors on patterns of polysemy and co-lexicalization. This method allows a fine-grained analysis of the factors behind the effects identified by areal lexico-semantics (Koptjevskaja-Tamm & Liljegren, 2017). The semantic map model was initially created to describe the polysemy patterns of grammatical morphemes (see Cysouw, Haspelmath, & Malchukov, 2010 for an overview). Although studies using the model cover a wide range of linguistic phenomena, the majority pertain to the domain of grammar (e.g., Haspelmath, 1997; van der Auwera & Plungian, 1998). However, recent studies by François (2008), Perrin (2010), Wälchli and Cysouw (2012), Rakhilina and Reznikova (2016), Youn et al. (2016), and Georgakopoulos et al. (2016) have shown that the model can fruitfully be extended to lexical items. The common denominator of both lines of research is that the semantic maps were usually plotted manually, which is particularly problematic for large-scale typological studies.

In this paper, we show that existing synchronic polysemy data in large language samples, such as ASJP (Wichmann et al., 2016), CLICS (List et al., 2014), and the Open Multilingual Wordnet (Bond & Paik, 2012), can be turned into lexical matrices using Python scripts (a toy sketch of this conversion is given after the abstract). From these lexical matrices, one can infer large-scale weighted classical lexical semantic maps, using an adapted version of the algorithm introduced by Regier, Khetarpal, and Majid (2013) (see the second sketch below). This approach is innovative in several respects. First, lexical semantic maps are inferred and plotted automatically, directly from a significant amount of cross-linguistic data (cf. Youn et al., 2016). Second, unlike other types of polysemy networks in the field, these maps are structured: they respect the connectivity hypothesis (Croft, 2001) and what we call the ‘economy principle’. As such, they generate more interesting implicational universals and can be falsified on the basis of additional empirical evidence. Finally, weighted lexical semantic maps make it possible to explore the frequency of polysemy patterns and shared lexicalizations from both a semasiological and an onomasiological perspective, which is hardly achievable with other methods.

We apply this method to a case study of verbs of perception and cognition (see the Appendix for a provisional semantic map) and enrich the result with additional cross-linguistic data (Zalizniak et al., 2012). The semantic map method allows one to visualize a structured cross-linguistic polysemy network and to systematically analyze how lexical items map onto it. More specifically, the method allows one to differentiate between common polysemy patterns attested in unrelated languages and shared polysemy patterns, that is, colexification patterns shared among languages of the same area. These results will be compared to (i) geographical and genetic data, in order to determine the interaction between lexicalization patterns and areality, on the one hand, and common inheritance, on the other. Our findings will also be compared to (ii) proposed universal generalizations, in order to evaluate their validity and limits, and to (iii) language- and culture-specific associations identified in the literature (e.g., Viberg, 1984; Sweetser, 1990; Evans & Wilkins, 2000; Aikhenvald & Storch, 2013), in order to assess the degree to which a bottom-up method relying on large language samples matches the results of case studies conducted by experts.
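The abstract refers to Python scripts that convert polysemy data into lexical matrices. As a rough illustration of that step, the sketch below builds a binary item-by-concept matrix from toy (language, form, concept) triples. The input format and all names are illustrative assumptions, not the actual formats of ASJP, CLICS, or the Open Multilingual Wordnet.

    from collections import defaultdict

    # Toy colexification records: (language, lexical form, concept).
    # Real sources (ASJP, CLICS, Open Multilingual Wordnet) use richer
    # formats; these flat triples are an illustrative assumption.
    RECORDS = [
        ("English", "see",   "SEE"),
        ("English", "see",   "UNDERSTAND"),  # 'I see' = 'I understand'
        ("German",  "hören", "HEAR"),
        ("German",  "hören", "OBEY"),
        ("Swahili", "sikia", "HEAR"),
        ("Swahili", "sikia", "FEEL"),
        ("Swahili", "sikia", "UNDERSTAND"),
    ]

    def build_lexical_matrix(records):
        """Turn (language, form, concept) triples into a binary matrix
        with one row per lexical item and one column per concept."""
        concepts = sorted({concept for _, _, concept in records})
        meanings = defaultdict(set)  # (language, form) -> set of concepts
        for lang, form, concept in records:
            meanings[(lang, form)].add(concept)
        items = sorted(meanings)
        matrix = [[int(c in meanings[item]) for c in concepts]
                  for item in items]
        return items, concepts, matrix

    items, concepts, matrix = build_lexical_matrix(RECORDS)
    print(concepts)
    for item, row in zip(items, matrix):
        print(item, row)

On real samples the rows would number in the tens of thousands, so one would likely store the matrix sparsely, but the logic of the conversion is the same.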
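The map-inference step adapts the algorithm of Regier, Khetarpal, and Majid (2013), which seeks a small set of edges between concepts such that every attested meaning set forms a connected subgraph, in line with the connectivity hypothesis; the ‘economy principle’ corresponds to using as few edges as possible. The second sketch below is a simplified, illustrative reconstruction of that greedy idea, not the authors’ adapted version, and the edge-weighting scheme (number of items colexifying both endpoint concepts) is likewise an assumption.

    import itertools
    from collections import defaultdict

    # Toy meaning sets: the concepts expressed by one lexical item each
    # (e.g. a verb covering HEAR, FEEL, and UNDERSTAND). Illustrative only.
    MEANING_SETS = [
        frozenset({"SEE", "UNDERSTAND"}),
        frozenset({"HEAR", "OBEY"}),
        frozenset({"HEAR", "FEEL", "UNDERSTAND"}),
        frozenset({"SEE"}),
    ]
    CONCEPTS = sorted(set().union(*MEANING_SETS))

    def n_components(nodes, edges):
        """Number of connected components of `nodes`, using only the
        edges whose endpoints both lie in the given node set."""
        adj = defaultdict(set)
        for a, b in edges:
            if a in nodes and b in nodes:
                adj[a].add(b)
                adj[b].add(a)
        seen, count = set(), 0
        for start in nodes:
            if start in seen:
                continue
            count += 1
            stack = [start]
            while stack:
                node = stack.pop()
                if node not in seen:
                    seen.add(node)
                    stack.extend(adj[node] - seen)
        return count

    def deficit(edges):
        """Total disconnection: excess components across meaning sets."""
        return sum(n_components(s, edges) - 1 for s in MEANING_SETS)

    def infer_weighted_map():
        """Greedily add the edge that best reduces disconnection until
        every meaning set is a connected subgraph (connectivity
        hypothesis), keeping the edge set small (economy principle)."""
        edges = set()
        while True:
            current = deficit(edges)
            if current == 0:
                break
            best_edge, best_gain = None, 0
            for edge in itertools.combinations(CONCEPTS, 2):
                if edge in edges:
                    continue
                gain = current - deficit(edges | {edge})
                if gain > best_gain:
                    best_edge, best_gain = edge, gain
            edges.add(best_edge)
        # Assumed weighting: number of items colexifying both endpoints.
        return {e: sum(1 for s in MEANING_SETS if e[0] in s and e[1] in s)
                for e in edges}

    for (a, b), weight in sorted(infer_weighted_map().items()):
        print(f"{a} -- {b}: attested in {weight} item(s)")

Greedy addition only approximates a truly minimal edge set, but it keeps the procedure tractable on large concept inventories, which is the point of automating map inference.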
